Loading and Visualization#

In this section, we illustrate how to load and visualize the pseudoPAGES2k dataset with cfr.

Required data to complete this tutorial:

[1]:
%load_ext autoreload
%autoreload 2

import cfr
import xarray as xr

import os
os.chdir('/glade/u/home/fengzhu/Github/cfr/docsrc/notebooks/')

Load the pseudoPAGES2k dataset with xarray#

As a first step, we can open the netCDF file directly with xarray to inspect its data structure:

[2]:
ds = xr.open_dataset('./data/ppwn_SNRinf_rta.nc', use_cftime=True)
ds
[2]:
<xarray.Dataset>
Dimensions:  (time: 1156)
Coordinates:
  * time     (time) object 0850-01-01 00:00:00 ... 2005-01-01 00:00:00
Data variables: (12/558)
    NAm_153  (time) float64 ...
    NAm_165  (time) float64 ...
    Asi_178  (time) float64 ...
    Asi_174  (time) float64 ...
    Asi_198  (time) float64 ...
    NAm_145  (time) float64 ...
    ...       ...
    Ocn_169  (time) float64 ...
    Asi_201  (time) float64 ...
    Asi_179  (time) float64 ...
    Arc_014  (time) float64 ...
    Ocn_071  (time) float64 ...
    Ocn_072  (time) float64 ...
[3]:
ds['NAm_001']
[3]:
<xarray.DataArray 'NAm_001' (time: 1156)>
[1156 values with dtype=float64]
Coordinates:
  * time     (time) object 0850-01-01 00:00:00 ... 2005-01-01 00:00:00
Attributes:
    lat:         35.3
    lon:         248.6
    elev:        nan
    ptype:       tree.TRW
    dt:          1.0
    time_name:   Time
    time_unit:   yr
    value_name:  trsgi
    value_unit:  NA

This netCDF file contains many data variables, each named after a proxy ID. Each variable carries the following fundamental attributes:

  • lat: the latitude of the site

  • lon: the longitude of the site

  • ptype: the proxy type

  • dt: step of the time axis, i.e., temporal resolution

  • time_name: name of the time axis

  • time_unit: unit of the time axis

  • value_name: name of the value axis

  • value_unit: unit of the value axis

A netCDF file in this format is ready to be used with cfr.
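Before handing a file to cfr, it can be useful to confirm that a variable carries all of the attributes listed above. The following is a minimal sketch, not part of cfr: the helper `missing_attrs` and the constant `REQUIRED_ATTRS` are hypothetical names introduced here for illustration, and the check works on any mapping, such as the `.attrs` dictionary that xarray exposes on a data variable.

```python
# Attribute names taken from the list above. `elev` is left out because the
# tutorial does not list it among the fundamental attributes.
REQUIRED_ATTRS = ('lat', 'lon', 'ptype', 'dt',
                  'time_name', 'time_unit', 'value_name', 'value_unit')


def missing_attrs(attrs):
    """Return the required attribute names that are absent from `attrs`.

    `attrs` can be any mapping, e.g. `ds['NAm_001'].attrs` from xarray.
    (Hypothetical helper for illustration; not a cfr function.)
    """
    return [name for name in REQUIRED_ATTRS if name not in attrs]


# Attributes copied from the NAm_001 output shown earlier.
nam_001 = {
    'lat': 35.3, 'lon': 248.6, 'elev': float('nan'), 'ptype': 'tree.TRW',
    'dt': 1.0, 'time_name': 'Time', 'time_unit': 'yr',
    'value_name': 'trsgi', 'value_unit': 'NA',
}

print(missing_attrs(nam_001))                    # []
print(missing_attrs({'lat': 0.0, 'lon': 0.0}))   # the six names still missing
```

An empty list means the variable has every attribute named in the list; anything else points to the fields to add before loading.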

Load the pseudoPAGES2k dataset with cfr#

The cfr.ProxyDatabase class comes with a .load_nc() method that loads a proxy database from a netCDF file that follows the format shown above:

[4]:
# load the pseudoPAGES2k database from a netCDF file
pdb = cfr.ProxyDatabase().load_nc('./data/ppwn_SNRinf_rta.nc')

Visualize the pseudoPAGES2k dataset#

Once the netCDF file is loaded as a cfr.ProxyDatabase, we can easily visualize the dataset with the .plot() method:

[5]:
fig, ax = pdb.plot()
../_images/notebooks_pp2k-pdb-load-viz_9_0.png

We may also plot the map along with the count of available records over time:

[7]:
fig, ax = pdb.plot(plot_count=True)
../_images/notebooks_pp2k-pdb-load-viz_11_0.png

Since the dataset starts in 850 AD, we may adjust the x-axis limits of the count panel using matplotlib methods:

[6]:
fig, ax = pdb.plot(plot_count=True)
ax['count'].set_xlim(800, 2000)
[6]:
(800.0, 2000.0)
../_images/notebooks_pp2k-pdb-load-viz_13_1.png

Access and visualize a specific record#

A specific record can be accessed by its proxy ID:

[8]:
pobj = pdb['NAm_001']
pobj
[8]:
<cfr.proxy.ProxyRecord at 0x2abc9a7c58b0>

The returned object is a cfr.ProxyRecord, which comes with several attributes such as:

  • time: the time axis

  • value: the value axis

  • lat: the latitude of the site

  • lon: the longitude of the site

  • (… other metadata)

For instance, to access the record series:

[9]:
print('time axis:', pobj.time)
print('value axis:', pobj.value)
time axis: [ 850.  851.  852. ... 2000. 2001. 2002.]
value axis: [1.0390625 0.796875  0.796875  ... 0.75      0.453125  0.78125  ]

Now that we have the cfr.ProxyRecord object, we can easily visualize the record with its .plot() method:

[10]:
fig, ax = pobj.plot()
../_images/notebooks_pp2k-pdb-load-viz_19_0.png

We may slice the record to zoom in and out. For instance, let us check the instrumental period:

[11]:
fig, ax = pobj.slice([1850, 2000]).plot()
../_images/notebooks_pp2k-pdb-load-viz_21_0.png

We may also slice with a shortcut using strings of years:

[12]:
fig, ax = pobj['1850':'2000'].plot()
../_images/notebooks_pp2k-pdb-load-viz_23_0.png

This shortcut also supports a time step. For instance, to display the data points every 10 years between 1850 AD and 2000 AD:

[13]:
fig, ax = pobj['1850':'2000':'10'].plot(marker='o')
../_images/notebooks_pp2k-pdb-load-viz_25_0.png
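To make the effect of the string-based slice with a step more concrete, here is a conceptual sketch in plain Python. It is not cfr's implementation: the function `slice_by_year` is a hypothetical name, and it assumes that both bounds are inclusive and that the step counts data points (which equals years for an annual record such as this one).

```python
def slice_by_year(time, value, start, stop, step=1):
    """Keep points with start <= year <= stop, then take every `step`-th one.

    Assumptions (not confirmed by the tutorial): both bounds are inclusive,
    and `step` counts data points rather than years.
    """
    start, stop, step = float(start), float(stop), int(step)
    kept = [(t, v) for t, v in zip(time, value) if start <= t <= stop]
    kept = kept[::step]
    return [t for t, _ in kept], [v for _, v in kept]


# Synthetic annual series covering 850-2002, matching the printed time axis.
years = [float(y) for y in range(850, 2003)]
values = [0.0] * len(years)  # placeholder values, not real proxy data

t, v = slice_by_year(years, values, '1850', '2000', '10')
print(t[0], t[-1], len(t))  # 1850.0 2000.0 16
```

Under these assumptions, the 151 annual points between 1850 and 2000 are thinned to 16 points, one per decade.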

Check the proxy IDs on an interactive map#

One may ask “How do I know the proxy IDs?”

A cfr.ProxyDatabase object also comes with a .plotly() method that can help us check proxy IDs on an interactive map.

The cell below cannot be rendered on a webpage, but it should execute in a local Jupyter notebook. It should display an interactive map; hovering the mouse over a site marker shows the metadata of that site, including:

  • pid (proxy ID)

  • ptype

  • lat

  • lon

Give it a try yourself!

[14]:
pdb.plotly()